Previous work has shown the potential of deep learning to predict renal obstruction using kidney ultrasound images. However, these image-based classifiers have been trained with the goal of single-visit inference in mind. We compare methods from video action recognition (i.e. convolutional pooling, LSTM, TSM) to adapt single-visit convolutional models to handle multiple visit inference. We demonstrate that incorporating images from a patient's past hospital visits provides only a small benefit for the prediction of obstructive hydronephrosis. Therefore, inclusion of prior ultrasounds is beneficial, but prediction based on the latest ultrasound is sufficient for patient risk stratification.
translated by 谷歌翻译
Neurosymbolic Programming (NP) techniques have the potential to accelerate scientific discovery. These models combine neural and symbolic components to learn complex patterns and representations from data, using high-level concepts or known constraints. NP techniques can interface with symbolic domain knowledge from scientists, such as prior knowledge and experimental context, to produce interpretable outputs. We identify opportunities and challenges between current NP models and scientific workflows, with real-world examples from behavior analysis in science: to enable the use of NP broadly for workflows across the natural and social sciences.
translated by 谷歌翻译
通过一系列联邦举措和命令,美国政府一直在努力确保美国在AI中的领导。这些广泛的战略文件影响了美国空军美国部(DAF)等组织。DAF-MIT AI加速器是DAF和MIT之间的一项计划,以弥合AI研究人员与DAF任务要求之间的差距。DAF-MIT AI加速器支持的几个项目正在开发公共挑战问题,这些问题解决了许多联邦AI研究的重点。这些挑战是通过公开可用的大型AI-Ready数据集,激励开源解决方案,并为可以激发进一步研究的双重使用技术创建需求信号,来针对优先事项。在本文中,我们描述了正在开发的这些公共挑战以及它们的应用如何促进科学进步。
translated by 谷歌翻译
现实世界的设计问题是约束,目标和功能的混合组合。探索这些问题空间可以定义为多标准探索(MCX)问题,其目标是在许多目标中产生一组具有高性能的不同解决方案,同时避免在任何目标中均表现较低。质量多样性算法产生所需的设计变化,但通常仅考虑一个目标。我们提出了一个新的排名T-Domino,专门设计用于处理MCX问题中的多个目标。 T-Domino将相对于档案中的其他解决方案进行排名,从而使表现平衡的个人相对于以其他目标为代价的人来表现出色。在每个地图 - 精英垃圾箱中仅保留一个平衡的解决方案可以保持档案的可访问性 - 这是设计探索的强大资产。我们在一组易于理解的基准测试中说明了我们的方法,并在许多目标现实世界的架构案例研究中展示了其潜力。
translated by 谷歌翻译
我们制定了一种评估给定两组图像的生成网络性能的度量。当前用于执行此操作的流行绩效指标是Fr \'Echet Inception距离(FID)。 FID假设使用Inception-V3的倒数第二层遵循高斯分布来特征的图像,如果我们希望将FID用作度量标准,则不会违反这种假设。但是,我们表明,ImakeNet数据集的Inception-V3特征不是高斯。特别是,每个边缘都不是高斯。为了解决这个问题,我们使用高斯混合模型(GMM)对特征图像进行建模,并计算限于GMM的2-Wasserstein距离。我们通过使用Inception-V3(或其他分类器)在两组图像上定义了一个称为WAM的性能度量,以表征图像,估算两个GMM,并使用受限的$ 2 $ - WASSERSTEIN距离比较GMMS。我们通过实验表明WAM比FID的优势,包括FID比WAM对不可察觉的图像扰动更敏感。通过建模从Inception-V3作为GMM获得的非高斯特征并使用GMM度量,我们可以更准确地评估生成网络性能。
translated by 谷歌翻译
联合学习(FL)作为边缘设备的有希望的技术,以协作学习共享预测模型,同时保持其训练数据,从而解耦了从需要存储云中的数据的机器学习的能力。然而,在规模和系统异质性方面,FL难以现实地实现。虽然有许多用于模拟FL算法的研究框架,但它们不支持在异构边缘设备上进行可扩展的流程。在本文中,我们呈现花 - 一种全面的FL框架,通过提供新的设施来执行大规模的FL实验并考虑丰富的异构流程来区分现有平台。我们的实验表明花卉可以仅使用一对高端GPU在客户尺寸下进行FL实验。然后,研究人员可以将实验无缝地迁移到真实设备中以检查设计空间的其他部分。我们认为花卉为社区提供了一个批判性的新工具,用于研究和发展。
translated by 谷歌翻译
We demonstrate a proof-of-concept of a large language model conducting corporate lobbying related activities. We use an autoregressive large language model (OpenAI's text-davinci-003) to determine if proposed U.S. Congressional bills are relevant to specific public companies and provide explanations and confidence levels. For the bills the model deems as relevant, the model drafts a letter to the sponsor of the bill in an attempt to persuade the congressperson to make changes to the proposed legislation. We use hundreds of ground-truth labels of the relevance of a bill to a company to benchmark the performance of the model, which outperforms the baseline of predicting the most common outcome of irrelevance. However, we test the ability to determine the relevance of a bill with the previous OpenAI GPT-3 model (text-davinci-002), which was state-of-the-art on many language tasks until text-davinci-003 was released on November 28, 2022. The performance of text-davinci-002 is worse than simply always predicting that a bill is irrelevant to a company. These results suggest that, as large language models continue to improve core natural language understanding capabilities, performance on corporate lobbying related tasks will continue to improve. We then discuss why this could be problematic for societal-AI alignment.
translated by 谷歌翻译
In the past years, deep learning has seen an increase of usage in the domain of histopathological applications. However, while these approaches have shown great potential, in high-risk environments deep learning models need to be able to judge their own uncertainty and be able to reject inputs when there is a significant chance of misclassification. In this work, we conduct a rigorous evaluation of the most commonly used uncertainty and robustness methods for the classification of Whole-Slide-Images under domain shift using the H\&E stained Camelyon17 breast cancer dataset. Although it is known that histopathological data can be subject to strong domain shift and label noise, to our knowledge this is the first work that compares the most common methods for uncertainty estimation under these aspects. In our experiments, we compare Stochastic Variational Inference, Monte-Carlo Dropout, Deep Ensembles, Test-Time Data Augmentation as well as combinations thereof. We observe that ensembles of methods generally lead to higher accuracies and better calibration and that Test-Time Data Augmentation can be a promising alternative when choosing an appropriate set of augmentations. Across methods, a rejection of the most uncertain tiles leads to a significant increase in classification accuracy on both in-distribution as well as out-of-distribution data. Furthermore, we conduct experiments comparing these methods under varying conditions of label noise. We observe that the border regions of the Camelyon17 dataset are subject to label noise and evaluate the robustness of the included methods against different noise levels. Lastly, we publish our code framework to facilitate further research on uncertainty estimation on histopathological data.
translated by 谷歌翻译
In large-scale machine learning, recent works have studied the effects of compressing gradients in stochastic optimization in order to alleviate the communication bottleneck. These works have collectively revealed that stochastic gradient descent (SGD) is robust to structured perturbations such as quantization, sparsification, and delays. Perhaps surprisingly, despite the surge of interest in large-scale, multi-agent reinforcement learning, almost nothing is known about the analogous question: Are common reinforcement learning (RL) algorithms also robust to similar perturbations? In this paper, we investigate this question by studying a variant of the classical temporal difference (TD) learning algorithm with a perturbed update direction, where a general compression operator is used to model the perturbation. Our main technical contribution is to show that compressed TD algorithms, coupled with an error-feedback mechanism used widely in optimization, exhibit the same non-asymptotic theoretical guarantees as their SGD counterparts. We then extend our results significantly to nonlinear stochastic approximation algorithms and multi-agent settings. In particular, we prove that for multi-agent TD learning, one can achieve linear convergence speedups in the number of agents while communicating just $\tilde{O}(1)$ bits per agent at each time step. Our work is the first to provide finite-time results in RL that account for general compression operators and error-feedback in tandem with linear function approximation and Markovian sampling. Our analysis hinges on studying the drift of a novel Lyapunov function that captures the dynamics of a memory variable introduced by error feedback.
translated by 谷歌翻译
While the capabilities of autonomous systems have been steadily improving in recent years, these systems still struggle to rapidly explore previously unknown environments without the aid of GPS-assisted navigation. The DARPA Subterranean (SubT) Challenge aimed to fast track the development of autonomous exploration systems by evaluating their performance in real-world underground search-and-rescue scenarios. Subterranean environments present a plethora of challenges for robotic systems, such as limited communications, complex topology, visually-degraded sensing, and harsh terrain. The presented solution enables long-term autonomy with minimal human supervision by combining a powerful and independent single-agent autonomy stack, with higher level mission management operating over a flexible mesh network. The autonomy suite deployed on quadruped and wheeled robots was fully independent, freeing the human supervision to loosely supervise the mission and make high-impact strategic decisions. We also discuss lessons learned from fielding our system at the SubT Final Event, relating to vehicle versatility, system adaptability, and re-configurable communications.
translated by 谷歌翻译